Fast Algorithms for RNA Secondary Structure Prediction

Author

  • Sanjay Vaghela

Abstract

RNA secondary structure prediction with pseudoknots is important because pseudoknots are part of functionally important RNAs in cells. State-of-the-art dynamic programming algorithms due to Akutsu et al. [7] and Deogun et al. [8] perform well on single RNA sequences. The aim of this project is to predict the secondary structure of real-life RNA sequences, which can be more than 700 nucleotides long. The recurrence for solving simple pseudoknots requires O(n^4) time and O(n^3) space. Existing dynamic programming algorithms solve this recurrence over the full length of the RNA sequence, which makes them impractical for sequences longer than 500 nucleotides. We propose a novel two-stage dynamic programming algorithm that overcomes this drawback by predicting the regions containing pseudoknots in the first stage and the actual structure of the pseudoknots in the second stage. Another scalable approach, HxMatch by Witwer et al. [15], predicts the secondary structure of the consensus sequence of multiple aligned RNA sequences. HxMatch uses Maximum Weighted Matching (MWM) [30], which has O(n^3) time complexity, to choose base pairs based on a score matrix. A few observations suggest that the score matrix is informative enough that simpler algorithms than MWM can be used to choose base pairs. We propose a Random Sampling algorithm with O(n log n) time complexity for choosing base pairs instead of MWM. We performed experiments on the PseudoBase dataset and on other real-life datasets taken from Rfam and NCBI. The two-stage algorithm is faster than both Deogun's algorithm and Rivas' algorithm; its prediction accuracy is lower than that of Deogun's algorithm but better than that of Rivas' algorithm. The Random Sampling algorithm performs as well as MWM for most of the sequences.
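The abstract does not describe how the Random Sampling step works, so the following is only an illustrative sketch of the general idea of choosing base pairs from a score matrix without solving a matching problem. It assumes a hypothetical input `scores` (a dictionary from position pairs to positive scores) and a hypothetical function name `sample_basepairs`; it visits candidate pairs in a score-weighted random order and keeps a pair whenever both positions are still free. It is not the thesis's algorithm, and its cost is O(m log m) in the number m of candidate pairs rather than the bound quoted above.

```python
import random


def sample_basepairs(scores, rng=None):
    """Pick a conflict-free set of base pairs from scored candidates.

    scores: dict mapping (i, j) with i < j to a positive score
            (hypothetical input format, assumed for this sketch).
    Returns a sorted list of pairs in which every position appears
    at most once.
    """
    rng = rng or random.Random(0)

    # Weighted random order: each pair gets the key u ** (1 / score) with
    # u uniform in [0, 1); higher-scoring pairs tend to receive larger keys
    # and are therefore visited earlier.
    ordered = sorted(
        scores.items(),
        key=lambda item: rng.random() ** (1.0 / item[1]),
        reverse=True,
    )

    used = set()      # positions already paired
    chosen = []       # accepted base pairs
    for (i, j), _score in ordered:
        if i not in used and j not in used:
            used.update((i, j))
            chosen.append((i, j))
    return sorted(chosen)


if __name__ == "__main__":
    toy_scores = {(0, 5): 3.0, (1, 4): 2.0, (0, 4): 0.5, (2, 3): 1.0}
    print(sample_basepairs(toy_scores, random.Random(1)))
```

Unlike MWM, this kind of selection gives no optimality guarantee; it can only be expected to come close when, as the abstract suggests, the score matrix already separates good base pairs from poor ones.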

Similar resources

PreRkTAG: Prediction of RNA Knotted Structures Using Tree Adjoining Grammars

Background: RNA molecules play many important regulatory, catalytic and structural ...

SCARNA: fast and accurate structural alignment of RNA sequences by matching fixed-length stem fragments

MOTIVATION: The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comp...

An Algorithm for Template-Based Prediction of Secondary Structures of Individual RNA Sequences

While understanding the structure of RNA molecules is vital for deciphering their functions, determining RNA structures experimentally is exceptionally hard. At the same time, extant approaches to computational RNA structure prediction have limited applicability and reliability. In this paper we provide a method to solve a simpler yet still biologically relevant problem: prediction of secondary...

MMKnots: A max-margin model for RNA secondary structure prediction including pseudoknots

Motivation: The ideal algorithm for the prediction of pseudoknotted RNA secondary structures will provide fast and accurate predictions for pseudoknots of arbitrary complexity. However, existing algorithms are typically lacking on one of these three axes. Energy-based methods suffer from the intractability of pseudoknotted structure prediction under realistic energy models, while statistical ap...

An Alignment Algorithm by Matching Fixed-Length Stem Fragments for Comparing RNA Sequences

The functions of non-coding RNAs are strongly related to their secondary structures, but it is known that a secondary structure prediction of a single sequence is not reliable. Therefore, we have to collect similar RNA sequences with a common secondary structure for the analyses of a new non-coding RNA without knowing the exact secondary structure itself. Therefore, the sequence comparison in ...

Publication date: 2006